Abstract

In recent years many chaotic cryptosystems based on Baptista's seminal work have 
been proposed. We analyze the security of two of the newest and most interest- 
ing ones, which use a dynamically updated look-up table and also work as stream 
ciphers. We provide different attack techniques to recover the keystream used by 
the algorithms. The knowledge of this keystream provides the attacker with the 
same information as the key and thus the security is broken. We also show that 
the dependence on the plaintext, and not on the key, of the look-up table updating 
mechanism facilitates cryptanalysis. 
1 Introduction



Since M. S. Baptista proposed in 1998 a new cryptosystem based on the prop- 
erty of ergodicity of chaotic systems [1], a number of new algorithms based 
on variations of Baptista's have been published [2,3,4,5,6]. In [7] we analyzed 
the security and cryptographic robustness of Baptista's seminal algorithm. 
The first variation of Wong [2] was cryptanalyzed in [8]. We present in this 
Letter our results after having thoroughly studied Wong's second and third 
algorithms [4,5]. 

The ergodicity property is exploited in these algorithms by using the logistic 
map 

$y_{n+1} = b\,y_n(1 - y_n)$,    (1)

where $y_n \in [0, 1]$ and the parameter $b$ is chosen so that Eq. (1) behaves
chaotically. In [4], the most interesting addition consists of using a dynamic table
for looking up the ciphertext and plaintext which is no longer fixed during 
the whole encryption and decryption processes. Instead, it depends on the 
plaintext, being continuously updated during the encryption and decryption 
processes. This makes cryptanalysis more difficult, but not impossible. 

When the $i$th message block is encrypted, the look-up table is updated
dynamically by exchanging the $i$th entry $l_i$ with another entry $l_j$. The
location of the latter entry, i.e., the value of $j$, is determined by the
current value of $y$ using the following formula:



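$v = \left\lfloor \frac{y - y_{\min}}{y_{\max} - y_{\min}} \times N \right\rfloor$,    (2)

$j = (i + v) \bmod N$,    (3)

where $y_{\min}$ and $y_{\max}$ are the end points of the chosen interval
$[y_{\min}, y_{\max})$ and $N$ is the total number of entries in the table [4].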
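The update of Eqs. (2) and (3) can be sketched in a few lines of Python. This is only an illustration: the names `swap_offset` and `update_table`, the list representation of the table, and the reduction of the block index modulo $N$ for messages longer than the table are our assumptions and are not taken from [4].

```python
import math

def swap_offset(y, n_entries, y_min=0.2, y_max=0.8):
    """Eq. (2): offset v for an iterate y inside [y_min, y_max)."""
    ratio = (y - y_min) / (y_max - y_min)
    # The clamp guards against rounding when y is just below y_max.
    return min(math.floor(ratio * n_entries), n_entries - 1)

def update_table(table, i, y, y_min=0.2, y_max=0.8):
    """Eq. (3): swap entry i with entry j = (i + v) mod N, in place."""
    n = len(table)
    v = swap_offset(y, n, y_min, y_max)
    i = i % n  # assumption: the block index wraps around the table
    j = (i + v) % n
    table[i], table[j] = table[j], table[i]
    return j

table = ["s1", "s2", "s3", "s4"]
j = update_table(table, 0, 0.45)  # 0.45 lies in the second of four subintervals
```

Note that `swap_offset` only needs to know which of the $N$ equal subintervals contains $y$, not the exact value of $y$.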

In [5], the previously described chaotic cryptographic scheme is generalized by
allowing the swapping of multiple pairs of entries in the look-up table during
the encryption of each input block, and by allowing multiple runs of encryption
on the whole message continuously. Starting from the current entry $i$, $p$ pairs
of entries ($p \geq 1$) are swapped according to the following rule:
$i \leftrightarrow (i + v \bmod N)$,
$(i + v + 1 \bmod N) \leftrightarrow (i + 2v + 1 \bmod N)$,
$(i + 2v + 2 \bmod N) \leftrightarrow (i + 3v + 2 \bmod N)$, ...,
$(i + (p-1)v + p - 1 \bmod N) \leftrightarrow (i + pv + p - 1 \bmod N)$.
Once the message has been encrypted, the whole process is repeated
again $r$ times, $r \geq 1$. The final look-up table is the hash of the message [5].
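The multiple-pair rule can be written compactly by noting that the $k$th pair, counting from $k = 0$, joins positions $i + kv + k$ and $i + (k+1)v + k$, both taken modulo $N$. The following Python sketch is our own reading of that rule; the function name `swap_pairs` is hypothetical.

```python
def swap_pairs(table, i, v, p):
    """Swap p pairs of entries starting from entry i, with offset v (in place)."""
    n = len(table)
    for k in range(p):
        a = (i + k * v + k) % n        # left member of the k-th pair
        b = (i + (k + 1) * v + k) % n  # right member of the k-th pair
        table[a], table[b] = table[b], table[a]
    return table

# Eight entries, i = 0, v = 2, p = 2: swaps 0 <-> 2 and then 3 <-> 5.
result = swap_pairs(list(range(8)), 0, 2, 2)
```

With `p = 1` the loop performs the single exchange of Eq. (3).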



2 Classical types of attacks



When cryptanalyzing a ciphering algorithm, the general assumption made is
that the cryptanalyst knows exactly the design and working of the cryptosystem
under study, i.e., he knows everything about the cryptosystem except the
secret key. This is an evident requirement in today's secure communications
networks, usually referred to as Kerckhoffs' principle [9, p. 24]. According to
[9, p. 25], it is possible to differentiate between different levels of attacks on
cryptosystems. They are enumerated as follows, ordered from the hardest type
of attack to the easiest:

(1) Ciphertext only: The opponent possesses a string of ciphertext. 

(2) Known plaintext: The opponent possesses a string of plaintext, p, and 
the corresponding ciphertext, c. 
(3) Chosen plaintext: The opponent has obtained temporary access to the
encryption machinery. Hence he can choose a plaintext string, p, and 
construct the corresponding ciphertext string, c. 

(4) Chosen ciphertext: The opponent has obtained temporary access to the 
decryption machinery. Hence he can choose a ciphertext string, c, and 
construct the corresponding plaintext string, p. 

In each of these four attacks, the objective is to determine the key that was 
used. The last two attacks, which might seem unreasonable at first sight, 
are very common when the cryptographic algorithm, whose key is fixed by the 
manufacturer and unknown to the attacker, is embedded in a device which the 
attacker can freely manipulate. Daily life examples of such devices are
smartcards, electronic purse cards, GSM phone SIM (Subscriber Identity Module)
cards, POST (Point Of Sale Terminals) machines, or web application session
token encryption.



3 Keystream attacks 

Although at first sight the cipher under study might look like a block cipher,
in fact it behaves as a stream cipher [9, p. 20], a fundamental weakness, as
will be seen. The operation of the algorithm as a stream cipher can be explained
as follows. Suppose $K$ is the key, given by $y_0$ and $b$, and that
$p = p_1 p_2 \ldots$ is the plaintext string. A keystream $k = k_1 k_2 \ldots$ is
generated using Eq. (1). This keystream is used to encrypt the plaintext string
according to the rule

$c = e_{k_1}(p_1)\, e_{k_2}(p_2) \ldots = c_1 c_2 \ldots$

Decrypting the ciphertext string $c$ can be accomplished by computing the
keystream $k$ given the knowledge of the key $K$ and undoing the operations
$e_{k_i}$. In [4] the keystream is the complete orbit followed by iterating Eq. (1)
from the initial point $y_0$ with the parameter $b$ near 4.0. The unit interval is
divided up into equally spaced bins, each corresponding to a symbol of the
alphabet in use. However, instead of considering the whole unit interval, only
the subinterval [0.2, 0.8) is used. As a consequence of the natural invariant
density of Eq. (1), the orbit will frequently visit the forbidden subintervals
[0, 0.2) and [0.8, 1].

As an example of how to generate the keystream, let us iterate Eq. (1)
starting from $y_0 = 0.1777$ and $b = 3.9999995$, as in [4]. Let $s_i$ be the
symbol corresponding to each useful subinterval where $y_i$ lands, and let
$x$ be the iterates which visit the forbidden subintervals. The orbit
followed is $y_i = \{0.5844\ldots, 0.9714\ldots, 0.1109\ldots, 0.3945\ldots,
0.9555\ldots, 0.1698\ldots, 0.5641\ldots, 0.9835\ldots, 0.0647\ldots,
0.2420\ldots, \ldots\}$, which, for an alphabet of 256 symbols, is transformed
into the keystream $k = s_{165}\, x\, x\, s_{84}\, x\, x\, s_{156}\, x\, x\, s_{18} \ldots$
Next we show how to recover the keystream using chosen ciphertext, chosen
plaintext, and known plaintext attacks. It is important to note that knowing
the keystream $k$ generated by a certain key $K$ ($y_0$ and $b$) is entirely
equivalent to knowing the key. Therefore,
our keystream attacks focus on recovering $k$.
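The keystream generation just described can be reproduced with the short Python sketch below. It is an illustration under our own conventions: subintervals are numbered from 0, the symbol in subinterval $v$ is printed as s followed by $v + 1$, forbidden iterates are stored as `None`, and the function names are hypothetical.

```python
def keystream(y0, b, length, n_symbols, y_min=0.2, y_max=0.8):
    """Iterate Eq. (1) and record the subinterval index of each iterate."""
    y = y0
    stream = []
    for _ in range(length):
        y = b * y * (1.0 - y)  # Eq. (1)
        if y_min <= y < y_max:
            v = int((y - y_min) / (y_max - y_min) * n_symbols)
            stream.append(min(v, n_symbols - 1))
        else:
            stream.append(None)  # forbidden subinterval, printed as x
    return stream

def render(stream):
    """Print subinterval v as s(v+1) and forbidden iterates as x."""
    return " ".join("x" if k is None else f"s{k + 1}" for k in stream)

ks256 = render(keystream(0.1777, 3.9999995, 10, 256))
ks4 = render(keystream(0.1777, 3.9999995, 10, 4))
```

For the key of [4], `ks256` should read `s165 x x s84 x x s156 x x s18`, and with only four symbols `ks4` should read `s3 x x s2 x x s3 x x s1`, in agreement with the orbit listed above.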



3.1 How to circumvent the look-up table 

The choice of the look-up table updating method is most unfortunate, since
it allows the attacker to easily predict the new positions of the symbols even
without the exact knowledge of the value of $y$. In order to initially simplify our
analysis, we use a variable number of symbols, $N = 2^n$, $n = 1, \ldots, 8$. First,
we assume that the source emits two different symbols, $S_2 = \{s_1, s_2\}$. When
the orbit lands on the first subinterval [0.2, 0.5), the table will be updated
following Eqs. (2) and (3). Given that $0.2 \leq y < 0.5$, we have that

$0 \leq \frac{y - y_{\min}}{y_{\max} - y_{\min}} \times 2 < 1$,

and thus $v = 0$ and Eq. (3) is equivalent to:

$j = i \bmod 2$.

When the orbit lands on the second subinterval [0.5, 0.8), then $0.5 \leq y < 0.8$,
and thus $v = 1$ and Eq. (3) becomes:

$j = (i + 1) \bmod 2$.

If the source emits four different symbols, $S_4 = \{s_1, s_2, s_3, s_4\}$, then
$v = 0$ if the orbit lands on [0.2, 0.35), $v = 1$ if the orbit lands on
[0.35, 0.5), $v = 2$ if the orbit lands on [0.5, 0.65), and $v = 3$ if the orbit
lands on [0.65, 0.8).

The generalization for higher order sources is immediate. Even when multiple 
pairs are swapped at each encryption run, as in [5], the look-up table evolution 
is easily predicted. As a consequence, the look-up table plays no significant 
security role during the encryption process. It is not necessary to know the 
exact value of y to predict the next update. It suffices to know the subinterval 
where $y$ lands. Therefore, the updated table depends solely on the plaintext,
and not on the key ($y_0$ and $b$), to the advantage of the attacker. When
encrypting the same plaintext using different keys, the same updating sequence
will take place for the look-up table. Likewise, the same message will always
yield the same hash value regardless of the key used.
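The key independence can be made concrete: since the orbit is stopped in the subinterval that currently holds the plaintext symbol, $v$ is simply the position of that symbol in the table, and the whole sequence of tables can be replayed from the plaintext alone. The Python sketch below assumes the single-swap scheme of [4] with every block triggering an update; the function name and the numeric encoding ($s_1 = 0$, ..., $s_4 = 3$) are ours.

```python
def table_evolution(plaintext, n_symbols):
    """Replay the look-up table updates using only the plaintext (no key)."""
    table = list(range(n_symbols))  # table[position] = symbol index
    states = []
    for i, symbol in enumerate(plaintext):
        v = table.index(symbol)  # subinterval in which the orbit is stopped
        a, j = i % n_symbols, (i + v) % n_symbols
        table[a], table[j] = table[j], table[a]
        states.append(list(table))
    return states

# Plaintext s2 s1 s4 s1 s1, written as symbol indices.
states = table_evolution([1, 0, 3, 0, 0], 4)
```

Neither $y_0$ nor $b$ appears anywhere in the function, which is precisely the weakness pointed out above.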



3.2 How to obtain the keystream

In this subsection different attacks are described. Each of them aims at the 
recovery of the keystream. 

3.2.1 Chosen ciphertext attack

The chosen ciphertext attack is straightforward. Simply request the plaintext 
of the one-block ciphertexts c = 1, c = 2, c = 3, until the desired length 
of the keystream is reached. Either the correct symbol (when the iterate lands 
on a site) or an error (when it lands outside the boundaries) is obtained, 
one by one. Once the desired length of keystream has been recovered in this
way, any message encrypted with the same values of $y_0$ and $b$ can easily be
decrypted. Under this attack, the dynamically updated look-up table has no 
effect at all. This attack requires as many one-block ciphertexts as the length 
of the keystream that is to be recovered. This attack does not work on [5], 
because the attacker does not know the encrypted values of p and r, the first 
two blocks of the ciphertext. 
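A toy simulation of this attack on [4] is sketched below in Python. The decryption machinery is modelled by a closure that hides the key; this oracle, its name, and the convention of returning `None` for a decryption error are assumptions made for the sake of the example.

```python
def make_decryption_oracle(y0, b, n_symbols, y_min=0.2, y_max=0.8):
    """Toy decryption machinery; the attacker never reads y0 or b directly."""
    def decrypt_one_block(c):
        y = y0
        for _ in range(c):
            y = b * y * (1.0 - y)  # Eq. (1)
        if not (y_min <= y < y_max):
            return None  # decryption error: iterate outside [y_min, y_max)
        v = int((y - y_min) / (y_max - y_min) * n_symbols)
        return min(v, n_symbols - 1)  # initial table: position v holds symbol v
    return decrypt_one_block

def recover_keystream(oracle, length):
    """Query the one-block ciphertexts c = 1, 2, ..., length."""
    return [oracle(c) for c in range(1, length + 1)]

oracle = make_decryption_oracle(0.1777, 3.9999995, 4)
recovered = recover_keystream(oracle, 10)
```

Because every query is a single block, the table is still in its initial state when the symbol is looked up, so the dynamic update never comes into play.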

3.2.2 Chosen plaintext attack 

Let us deal with the chosen plaintext attack next. To make things even simpler
at first, we assume the fourth order symbol source $S_4$ and that $r_{\max} = 1$ (see
[4]). Although unknown to the cryptanalyst, the system key $K$ is given by
$y_0 = 0.1777$ and $b = 3.9999995$, using the interval [0.2, 0.8), as in [4].

Our goal is to find out the exact position of all occurrences of $s_1$, $s_2$,
$s_3$, and $s_4$ in the keystream. But this task is not as easy as requesting the
ciphertext of $p = s_1 s_1 s_1 \ldots$, then of $p = s_2 s_2 s_2 \ldots$, etc., as
in [7]. The dynamic look-up table prevents knowing whether a certain symbol
corresponds to its original position in the table or has been already changed.
But given that the changes produced in the table are known even when the exact
value of $y$ is unknown, the following attack can be designed.

First, in order to know the exact position of all symbols $s_1$ in the keystream
$k$, we need to construct an adequate plaintext $p$. Table 1 represents the final
result of the process followed to compute the correct value of $p$. We start
constructing Table 1 by filling in columns $i$ and $k_i$, already known in advance.
Columns 0, 1, 2, and 3 reflect the current state of the look-up table, i.e., which
symbol is at which position at any given moment. Next, we proceed row by
row, in the following way:

(1) Assign to $p_i$ the value of the symbol ($s_1$, $s_2$, $s_3$, or $s_4$) in the
previous row in the subinterval corresponding to $k_i$. At start, the look-up
table is not yet altered.

(2) Calculate $j = (i + k_i) \bmod 4$.

(3) Update the look-up table by interchanging the symbols in the subintervals
$i$ and $j$.

(4) Proceed to the next row.
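The four steps above can be sketched as a short Python routine that produces the plaintext to be requested for a given target subinterval. The function name `chosen_plaintext` and the numeric encoding of symbols ($s_1 = 0$, ..., $s_4 = 3$) are assumptions of this example.

```python
def chosen_plaintext(target, n_symbols, length):
    """Plaintext whose every block stops the orbit in subinterval `target`."""
    table = list(range(n_symbols))  # table[position] = symbol index
    plaintext = []
    for i in range(length):
        plaintext.append(table[target])  # step (1)
        j = (i + target) % n_symbols     # step (2)
        a = i % n_symbols
        table[a], table[j] = table[j], table[a]  # step (3)
    return plaintext

p_for_s1 = chosen_plaintext(0, 4, 5)
p_for_s2 = chosen_plaintext(1, 4, 12)
```

For the target $s_1$ the routine returns a constant plaintext, and for $s_2$ it returns 1 0 2 2 2 0 3 3 3 0 1 1, after which the table is back in its initial state and the pattern repeats.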

After proceeding in this way, the plaintext that corresponds to all symbols $s_1$
in the keystream is worked out: $p = 0\ 0\ 0\ 0\ 0 \ldots$ This plaintext is always
periodic. The corresponding ciphertext is requested, obtaining:
$c = 10\ 9\ 6\ 6\ 7 \ldots$

Hence, it is known for sure that there is a true symbol $s_1$ at the 10th
position, and at the 19th, etc. The partial keystream already obtained is

$k = x x x x x x x x x\, s_1\, x x x x x x x x\, s_1\, x x x x x\, s_1\, x x x x x\, s_1\, x x x x x x\, s_1 \ldots$

Next, we are to obtain the exact position of all symbols $s_2$ by constructing
Table 2. This table informs us that we have to request the ciphertext of
$p = 1\ 0\ 2\ 2\ 2\ 0\ 3\ 3\ 3\ 0\ 1\ 1 \ldots$, obtaining $c = 4\ 11\ 14\ 5 \ldots$
The improved partial keystream already obtained is
$k = x x x\, s_2\, x x x x x\, s_1\, x x x x\, s_2\, x x x\, s_1\, x x x x x\, s_1\, x x x\, s_2\, x\, s_1\, x x\, s_2\, x x x\, s_1 \ldots$

If we repeat this process, generating Tables 3 and 4 for symbols $s_3$ and $s_4$,
and requesting the corresponding ciphertexts, we would obtain the following
complete keystream:

$k = s_3\, x x\, s_2\, x x\, s_3\, x x\, s_1 s_4 s_4 s_4\, x\, s_2\, x x x\, s_1 s_4 s_4\, x\, s_3\, x\, s_1 s_4 s_4\, x\, s_2\, x\, s_1 s_4\, x\, s_2\, x x x\, s_1 \ldots$    (4)

The generalization for higher order sources is immediate. This attack is very
inexpensive too, since it only requires $N = 2^n$ plaintexts, $1 \leq n \leq 8$.
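The complete chosen plaintext attack for $r_{\max} = 1$ can be simulated with the Python sketch below. The encryption oracle is a simplified model of [4] written for this example (first landing on the target subinterval, single swap per block, iteration cap `max_iter` as a safety guard); all names are hypothetical.

```python
def make_encryption_oracle(y0, b, n, y_min=0.2, y_max=0.8, max_iter=100000):
    """Toy encryption machinery; the key (y0, b) stays hidden inside."""
    def encrypt(plaintext):
        table, y, cipher = list(range(n)), y0, []
        for i, symbol in enumerate(plaintext):
            target = table.index(symbol)  # subinterval holding the symbol
            for count in range(1, max_iter + 1):
                y = b * y * (1.0 - y)  # Eq. (1)
                if y_min <= y < y_max:
                    v = min(int((y - y_min) / (y_max - y_min) * n), n - 1)
                    if v == target:
                        break
            else:
                raise RuntimeError("orbit never reached the target subinterval")
            cipher.append(count)  # number of iterations used for this block
            a, j = i % n, (i + target) % n
            table[a], table[j] = table[j], table[a]
        return cipher
    return encrypt

def chosen_plaintext(target, n, length):
    """Plaintext whose every block stops the orbit in subinterval `target`."""
    table, plaintext = list(range(n)), []
    for i in range(length):
        plaintext.append(table[target])
        a, j = i % n, (i + target) % n
        table[a], table[j] = table[j], table[a]
    return plaintext

def recover_keystream(encrypt, n, blocks):
    """One chosen plaintext per symbol; returns the fully known prefix of k."""
    found, horizon = {}, None
    for target in range(n):
        position = 0
        for c in encrypt(chosen_plaintext(target, n, blocks)):
            position += c
            found[position] = target
        horizon = position if horizon is None else min(horizon, position)
    # Positions up to the horizon that hold no symbol are forbidden iterates.
    return [found.get(t) for t in range(1, horizon + 1)]

encrypt = make_encryption_oracle(0.1777, 3.9999995, 4)
recovered = recover_keystream(encrypt, 4, 5)
```

The attacker only calls `encrypt`, once per symbol of the alphabet, and every entry of `recovered` is either a subinterval index or `None` for a forbidden iterate.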

When $r_{\max} > 1$, the same procedure must be followed. However, there will
be blanks in the recovered keystream, because many valid iterations will be
skipped. In many cases, though, it is possible to narrow down the number
of possible symbols. Once the partial keystream has been worked out, while
trying to decrypt a ciphertext following the decryption method, iterates will
land on an $x$. The only possible symbols for those $x$ are those lying before
the $x$ at a distance smaller than $r_{\max}$. In this way, the possibilities are
greatly reduced, in many cases to only one possible value (the correct one).
When more than one value is possible, there are two courses of action. If the
plaintext is not of random nature, then the gaps can easily be filled by
selecting from the context one symbol amongst the possible ones. On the other
hand, if the plaintext is random, then a new plaintext must be requested, made
of all the previous correct symbols plus one of the candidates. If the
ciphertext is equal to the expected one, then the guess was correct. Otherwise,
new plaintexts must be encrypted until the correct guess is used.

There is still another possible modification presented in [4]. A new parameter,
called threshold ($y_{\mathrm{threshold}}$), can be introduced along with the key.
The current value of $y$ can be checked against this threshold, so that the
table is updated only when $y > y_{\mathrm{threshold}}$. However, although it is
assumed that this addition improves security, it is very easy to deduce which
is the subinterval to which $y_{\mathrm{threshold}}$ belongs. It can be observed in
Table 1 that when the plaintext symbol to be encrypted is $s_1$, the table is
never effectively updated. After the $N = 2^n$ tables are constructed, if
$y_{\mathrm{threshold}} > y_{\min}$, then there must occur repeated values in the
recovered keystream. Given that the position of $s_1$ is always correct in the
keystream, we know for sure that symbols $s_t$ for $t > 1$ which coincide with
a previous $s_1$ symbol must be incorrect. The greatest value of $t$ indicates
which is the symbol $y_{\mathrm{threshold}}$ belongs to. This attack works on [5]
too. The attacker can still predict the evolution of the look-up table knowing
only the plaintext.

3.2.3 Known plaintext attack 

Under this attack, each plaintext/ciphertext pair allows for the recovery of
a portion of the keystream. For the sake of simplicity and without loss of
generality, let us use once again the $S_4$ source, $r_{\max} = 1$, and
$y_{\mathrm{threshold}} = y_{\min}$. Let us set $p = 1\ 0\ 3\ 0\ 0 \ldots$, whose
ciphertext is $c = 4\ 11\ 5\ 9\ 5 \ldots$ Table 5 can be easily constructed, which
allows us to recover a correct portion of the keystream:
$k = x x x\, s_2\, x x x x x x x x x x\, s_2\, x x x x\, s_4\, x x x x x x x x\, s_2\, x x x x\, s_2 \ldots$
This process should be repeated with as many known plaintexts as possible, to
recover as big a portion of the keystream as possible. Therefore, this attack
does not guarantee total recovery of the keystream. It would work in a similar
way for [5].
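The construction of Table 5 can be sketched in Python as follows. The function name and the dictionary output (keystream position mapped to subinterval index, with $s_1 = 0$, ..., $s_4 = 3$) are conventions chosen for this example.

```python
def partial_keystream(plaintext, ciphertext, n_symbols):
    """Recover keystream entries from one known plaintext/ciphertext pair."""
    table = list(range(n_symbols))  # table[position] = symbol index
    position, found = 0, {}
    for i, (symbol, c) in enumerate(zip(plaintext, ciphertext)):
        position += c            # iterate at which the orbit was stopped
        k = table.index(symbol)  # subinterval that held the plaintext symbol
        found[position] = k
        a, j = i % n_symbols, (i + k) % n_symbols
        table[a], table[j] = table[j], table[a]
    return found

known = partial_keystream([1, 0, 3, 0, 0], [4, 11, 5, 9, 5], 4)
```

For the pair used in Table 5, `known` places $s_2$ at positions 4, 15, 29 and 34 and $s_4$ at position 20.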



3.3 How to decrypt using the keystream 

In the previous subsection, different methods to obtain the complete keystream
were introduced. Next, we explain how to recover the plaintext from a
ciphertext when the keystream is known.

For simplicity, the fourth order symbol source $S_4$ is used. We assume the
attacker already knows the keystream given by Eq. (4) and possesses the
following ciphertext: $c = 1\ 10\ 4\ 4\ 4 \ldots$ In order to decrypt it, Table 6 is
constructed in the following way. First, fill all the values of $k_i$ with the
symbol found in the keystream at the positions indicated by the ciphertext.
Next, according to the current state of the look-up table, assign the correct
value to $p_i$ which corresponds to each $k_i$. Calculate the value of $j$ and
update the look-up table accordingly. Move to the next row and repeat the
process until the ciphertext has been exhausted.
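The decryption procedure can be sketched in Python as shown below. Only the keystream positions actually visited by the ciphertext are needed, so the example stores them in a dictionary whose entries are those listed in Table 6; the function name and the numeric encoding ($s_1 = 0$, ..., $s_4 = 3$) are our own.

```python
def decrypt_with_keystream(ciphertext, keystream, n_symbols):
    """Decrypt using a recovered keystream instead of the key (y0, b)."""
    table = list(range(n_symbols))  # table[position] = symbol index
    position, plaintext = 0, []
    for i, c in enumerate(ciphertext):
        position += c
        k = keystream[position]     # subinterval visited at this iterate
        plaintext.append(table[k])  # symbol currently stored there
        a, j = i % n_symbols, (i + k) % n_symbols
        table[a], table[j] = table[j], table[a]
    return plaintext

# Keystream entries at the positions reached by c = 1 10 4 4 4 (see Table 6).
ks = {1: 2, 11: 3, 15: 1, 19: 0, 23: 2}
plain = decrypt_with_keystream([1, 10, 4, 4, 4], ks, 4)
```

The result corresponds to the plaintext $s_3 s_4 s_3 s_2 s_4$ of Table 6.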






4 Security of the hash 

We have tested that breaking the hash algorithm is possible when p = 1 and
r = 1, even without the knowledge of the key (y_0 and b). Let us consider for
example the following message: "Transfer $10000 to Alvarez's account.". If
encoded using 4-bit symbols, its hash is h = 1E825BC0A36974FD, expressed
in hexadecimal. Changing the message into "Transfer $30005 to Alvarez's
account." produces exactly the same hash. In order to avoid attacks on the
hashing scheme, it is essential that r > 1 and p > 1. In effect, in [5]
two runs and a small value of p are suggested. Greater values would increase
security, but at the cost of speed.
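How collisions arise for p = 1 and r = 1 can be illustrated with a toy search in Python. The sketch below uses a four-symbol alphabet rather than the 4-bit symbols of the example above, and it does not attempt to reproduce the hash value quoted there; the function names are hypothetical.

```python
from itertools import product

def table_hash(message, n_symbols):
    """Final look-up table for p = 1, r = 1; computed without any key."""
    table = list(range(n_symbols))  # table[position] = symbol index
    for i, symbol in enumerate(message):
        v = table.index(symbol)
        a, j = i % n_symbols, (i + v) % n_symbols
        table[a], table[j] = table[j], table[a]
    return tuple(table)

def find_collision(n_symbols, length):
    """Exhaustively search messages of a given length for equal hashes."""
    seen = {}
    for message in product(range(n_symbols), repeat=length):
        h = table_hash(message, n_symbols)
        if h in seen:
            return seen[h], message
        seen[h] = message
    return None

collision = find_collision(4, 2)
```

Even for two-block messages the search succeeds immediately, and since no key enters `table_hash`, anybody can carry it out.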

Although in [5] it is claimed that this hash can be treated as a message
authentication code (MAC), in fact it cannot. A MAC is a key-dependent one-way
hash function. However, as already shown, the look-up table, and hence the
hash, does not depend on the key (y_0 and b). Therefore, this scheme does not
behave as a MAC but as a one-way hash function, even though the knowledge
of the key is necessary to verify the hash when both authenticity and secrecy
are to be provided.



5 Conclusions 

In spite of dynamically updating the look-up table, the same fundamental
weakness present in Baptista's algorithm [1] is reproduced in the chaotic
cryptosystems proposed in [4,5], as proved by our different attacks. As a
consequence of our attacks, an important conclusion is that implementations of
these algorithms can never reuse the same key because if so, they are easily
broken. Furthermore, the look-up table does not depend on the key, but only
on the plaintext, thus facilitating cryptanalysis. After these attacks, we
conclude that the lack of security, along with the low encryption speed,
discourages the use of these algorithms for secure applications. We are to
investigate how the weaknesses outlined in this Letter might affect the
security of Wong's other variants [6].
Table 1

Plaintext ($p_i$) needed to find out the exact position of symbols $s_1$ ($k_i = s_1 = 0$).

i   j   0   1   2   3   k_i  p_i
-   -   s1  s2  s3  s4  -    -
0   0   s1  s2  s3  s4  0    s1
1   1   s1  s2  s3  s4  0    s1
2   2   s1  s2  s3  s4  0    s1
3   3   s1  s2  s3  s4  0    s1

Table 2

Plaintext ($p_i$) needed to find out the exact position of symbols $s_2$ ($k_i = s_2 = 1$).

i   j   0   1   2   3   k_i  p_i
-   -   s1  s2  s3  s4  -    -
0   1   s2  s1  s3  s4  1    s2
1   2   s2  s3  s1  s4  1    s1
2   3   s2  s3  s4  s1  1    s3
3   0   s1  s3  s4  s2  1    s3
Table 3

Plaintext ($p_i$) needed to find out the exact position of symbols $s_3$ ($k_i = s_3 = 2$).

i   j   0   1   2   3   k_i  p_i
-   -   s1  s2  s3  s4  -    -
0   2   s3  s2  s1  s4  2    s3
1   3   s3  s4  s1  s2  2    s1
2   0   s1  s4  s3  s2  2    s1
3   1   s1  s2  s3  s4  2    s3
Table 4

Plaintext ($p_i$) needed to find out the exact position of symbols $s_4$ ($k_i = s_4 = 3$).
Table 5

Partial keystream ($k_i$) recovered when both plaintext ($p_i$) and ciphertext ($c_i$) are
known.

i      j   0   1   2   3   k_i      p_i  c_i
-      -   s1  s2  s3  s4  -        -    -
0      1   s2  s1  s3  s4  s2 = 1   s2   4
1      2   s2  s3  s1  s4  s2 = 1   s1   11
2      1   s2  s1  s3  s4  s4 = 3   s4   5
3      0   s4  s1  s3  s2  s2 = 1   s1   9
4 ≡ 0  1   s1  s4  s3  s2  s2 = 1   s1   5
Table 6

Plaintext ($p_i$) recovered when the keystream ($k_i$) is known.

i      j   0   1   2   3   k_i      p_i
-      -   s1  s2  s3  s4  -        -
0      2   s3  s2  s1  s4  s3 = 2   s3
1      0   s2  s3  s1  s4  s4 = 3   s4
2      3   s2  s3  s4  s1  s2 = 1   s3
3      3   s2  s3  s4  s1  s1 = 0   s2
4 ≡ 0  2   s4  s3  s2  s1  s3 = 2   s4